Behavior Trees
Bridging Probabilistic Inference and Behavior Trees: An Interactive Framework for Adaptive Multi-Robot Cooperation
Wang, Chaoran, Sun, Jingyuan, Zhang, Yanhui, Wu, Changju
This paper proposes an Interactive Inference Behavior Tree (IIBT) framework that integrates behavior trees (BTs) with active inference under the free energy principle for distributed multi-robot decision-making. The proposed IIBT node extends conventional BTs with probabilistic reasoning, enabling online joint planning and execution across multiple robots. It remains fully compatible with standard BT architectures, allowing seamless integration into existing multi-robot control systems. Within this framework, multi-robot cooperation is formulated as a free-energy minimization process, where each robot dynamically updates its preference matrix based on perceptual inputs and peer intentions, thereby achieving adaptive coordination in partially observable and dynamic environments. The proposed approach is validated through both simulation and real-world experiments, including a multi-robot maze navigation task and a collaborative manipulation task, compared against traditional BTs (video: https://youtu.be/KX_oT3IDTf4). Experimental results demonstrate that the IIBT framework reduces BT node complexity by over 70%, while maintaining robust, interpretable, and adaptive cooperative behavior under environmental uncertainty.
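The free-energy formulation described above can be sketched in miniature: a hypothetical BT leaf picks the action whose predicted outcome distribution has the lowest KL divergence from its preference distribution, and folds a peer's intention into that preference. The class name `IIBTNode`, the method names, and the uniform 50/50 blending are illustrative assumptions, not the paper's implementation.

```python
import math

def expected_free_energy(q_outcome, preference):
    # Simplified EFE: expected surprise relative to preferred outcomes,
    # i.e. KL(q(o|a) || p(o)).
    return sum(q * (math.log(q + 1e-12) - math.log(p + 1e-12))
               for q, p in zip(q_outcome, preference))

class IIBTNode:
    """Hypothetical BT leaf: on tick, choose the action whose predicted
    outcome distribution minimizes free energy w.r.t. the preferences."""
    def __init__(self, likelihood, preference):
        self.likelihood = likelihood   # action -> predicted outcome dist.
        self.preference = preference   # p(o): preferred outcomes

    def update_preference(self, peer_intent):
        # Blend in a peer's intention (uniform mixing, purely illustrative),
        # then renormalize so the preference stays a distribution.
        mixed = [0.5 * p + 0.5 * q
                 for p, q in zip(self.preference, peer_intent)]
        z = sum(mixed)
        self.preference = [p / z for p in mixed]

    def tick(self):
        efe = {a: expected_free_energy(q, self.preference)
               for a, q in self.likelihood.items()}
        return min(efe, key=efe.get)   # lowest expected free energy wins

node = IIBTNode({"left": [0.9, 0.1], "right": [0.1, 0.9]},
                preference=[0.8, 0.2])
node.tick()                      # -> "left" (matches own preference)
node.update_preference([0.0, 1.0])
node.tick()                      # -> "right" (peer intent shifted it)
```

The interesting property this toy preserves is that coordination emerges from preference updates alone: no node in the tree is rewritten, only the distribution it minimizes against.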
Interpretable Robot Control via Structured Behavior Trees and Large Language Models
Chekam, Ingrid Maéva, Pastor-Martinez, Ines, Tourani, Ali, Millan-Romera, Jose Andres, Ribeiro, Laura, Soares, Pedro Miguel Bastos, Voos, Holger, Sanchez-Lopez, Jose Luis
With the increasing presence of intelligent robots in everyday life, the demand for reliable and straightforward Human-Robot Interaction (HRI) interfaces is rapidly rising. Traditional robot control paradigms require users to learn particular commands [1] or interact with the robots through rigid user interfaces, especially in unstructured environments [2]. However, recent works target more flexible and adaptive communication strategies, unlocking the full potential of autonomous agents in human-centered environments. Accordingly, advances in generative AI and Large Language Models (LLMs) reveal new opportunities for enabling seamless communication between humans and robots, where natural language is the primary means of communication [3]. Such models are powerful enough to comprehend given instructions and even "reason" about the demanded tasks, intentions, and environmental context [4]. When paired with robotic perception and control systems, LLMs enable users to intuitively instruct the robot to perform complex tasks such as following multiple objects [5], navigating through dynamic scenes [6], or interacting with specific items [7], all using natural dialogue. Furthermore, integrating multimodal capabilities, including vision and speech, enhances HRI by enabling more natural, context-aware communication and improving adaptability across tasks and environments [8].
Behavior Trees vs Executable Ontologies: a Comparative Analysis of Robot Control Paradigms
This paper compares two distinct approaches to modeling robotic behavior: imperative Behavior Trees (BTs) and declarative Executable Ontologies (EO), implemented through the boldsea framework. BTs structure behavior hierarchically using control-flow, whereas EO represents the domain as a temporal, event-based semantic graph driven by dataflow rules. We demonstrate that EO achieves comparable reactivity and modularity to BTs through a fundamentally different architecture: replacing polling-based tick execution with event-driven state propagation. We propose that EO offers an alternative framework, moving from procedural programming to semantic domain modeling, to address the semantic-process gap in traditional robotic control. EO supports runtime model modification, full temporal traceability, and a unified representation of data, logic, and interface -- features that are difficult or sometimes impossible to achieve with BTs, although BTs excel in established, predictable scenarios. The comparison is grounded in a practical mobile manipulation task and highlights the respective operational strengths of each approach in dynamic, evolving robotic systems.
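The core architectural contrast, polling ticks versus event-driven propagation, can be illustrated with two toy fragments. Both are sketches under assumed names; neither reflects the boldsea framework's actual API.

```python
# Polling (BT-style): a condition is re-evaluated on every tick of the
# tree, whether or not anything in the world has changed.
class BatteryLowCondition:
    def __init__(self, robot):
        self.robot = robot
    def tick(self):
        return "SUCCESS" if self.robot["battery"] < 20 else "FAILURE"

# Event-driven (EO-style): a state write propagates through dataflow
# rules subscribed to that key; nothing runs when nothing changes.
class EventGraph:
    def __init__(self):
        self.state, self.rules, self.log = {}, {}, []
    def on(self, key, rule):
        self.rules.setdefault(key, []).append(rule)
    def set(self, key, value):
        self.state[key] = value
        for rule in self.rules.get(key, []):   # fire only on change
            rule(value, self)

graph = EventGraph()
graph.on("battery", lambda v, g: g.log.append("dock") if v < 20 else None)
graph.set("battery", 55)   # no rule action
graph.set("battery", 12)   # triggers "dock"
```

The trade-off the paper examines falls out of these two shapes: the polling version is trivially predictable (behavior is fully determined by tree traversal order), while the event version does no work between changes and can have rules added or removed at runtime.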
Compositional Coordination for Multi-Robot Teams with Large Language Models
Huang, Zhehui, Shi, Guangyao, Wu, Yuwei, Kumar, Vijay, Sukhatme, Gaurav S.
Abstract-- Multi-robot coordination has traditionally relied on a mission-specific and expert-driven pipeline, where natural language mission descriptions are manually translated by domain experts into mathematical formulation, algorithm design, and executable code. This conventional process is labor-intensive, inaccessible to non-experts, and inflexible to changes in mission requirements. Here, we propose LAN2CB (Language to Collective Behavior), a novel framework that leverages large language models (LLMs) to streamline and generalize the multi-robot coordination pipeline. LAN2CB transforms natural language (NL) mission descriptions into executable Python code for multi-robot systems through two core modules: (1) Mission Analysis, which parses mission descriptions into behavior trees, and (2) Code Generation, which leverages the behavior tree and a structured knowledge base to generate robot control code. We further introduce a dataset of natural language mission descriptions to support development and benchmarking. Experiments in both simulation and real-world environments demonstrate that LAN2CB enables robust and flexible multi-robot coordination from natural language, significantly reducing manual engineering effort and supporting broad generalization across diverse mission types.
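The two-module pipeline above can be sketched end to end, assuming a hypothetical JSON-like behavior-tree schema (the paper's actual schema, prompts, and knowledge base are not reproduced here): Mission Analysis would emit something like `mission_tree`, and Code Generation walks it to produce control code.

```python
# Hypothetical output of the Mission Analysis module for a pick-up mission.
mission_tree = {
    "type": "sequence",
    "children": [
        {"type": "action", "name": "goto", "args": {"x": 3, "y": 1}},
        {"type": "action", "name": "pick", "args": {"item": "box"}},
    ],
}

def generate_code(node, indent=0):
    """Walk the behavior tree and emit robot control statements.
    A sequence flattens to its children in order; an action becomes
    a method call on an assumed `robot` object."""
    pad = "    " * indent
    if node["type"] == "action":
        args = ", ".join(f"{k}={v!r}" for k, v in node["args"].items())
        return [f"{pad}robot.{node['name']}({args})"]
    lines = []
    for child in node["children"]:
        lines += generate_code(child, indent)
    return lines

code = "\n".join(generate_code(mission_tree))
# robot.goto(x=3, y=1)
# robot.pick(item='box')
```

Splitting the pipeline this way means the LLM only has to get the tree right; the deterministic generator guarantees the emitted code is syntactically valid, which is presumably part of why the framework generalizes across mission types.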
Combining Reinforcement Learning and Behavior Trees for NPCs in Video Games with AMD Schola
Liu, Tian, Cann, Alex, Colbert, Ian, Saeedi, Mehdi
For example, a recent study [1] concludes that NPCs based on behavior trees (BTs) are still more viable than those based on machine learning (ML), calling for new approaches, strategies, and tooling to overcome the barrier to adoption. Additional work has also underscored the need for reusable and adjustable models [2], motivated by game developers' preferences to reuse previously developed assets, provided that reuse does not result in repetitive gameplay. Traditional BT approaches and modern RL techniques each have their respective strengths and limitations in video game development. BTs offer a structured and hierarchical method for managing NPC behaviors, enabling the design of complex systems with predictable outcomes given sufficient development time. However, this complexity can make multi-task BTs less engaging and cumbersome to develop [2]. Conversely, RL provides a dynamic and adaptive approach to decision making [3], allowing developers to guide an agent through trial-and-error. However, training generally-capable RL models remains a challenge, particularly due to reward shaping, negative task transfer [4, 5], and compute resource demands [6].
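One common way to combine the two paradigms, consistent with the hybrid approach motivated above, is to wrap a learned policy as one BT leaf and keep hand-authored behaviors as siblings under a selector. The sketch below is a generic illustration, not AMD Schola's API; `PolicyNode`, `FleeNode`, and the stub policy are assumptions.

```python
class PolicyNode:
    """BT leaf that defers to a learned policy; the lambda below is a
    stub standing in for a trained RL model."""
    def __init__(self, policy):
        self.policy = policy
    def tick(self, obs):
        return ("RUNNING", self.policy(obs))

class FleeNode:
    """Scripted behavior: hand-authored rules keep the NPC predictable
    in the cases designers care most about."""
    def tick(self, obs):
        if obs["health"] < 25:
            return ("RUNNING", "flee")
        return ("FAILURE", None)

class Selector:
    """Try children in order; the first non-FAILURE result wins, so
    scripted nodes placed first override the learned policy."""
    def __init__(self, children):
        self.children = children
    def tick(self, obs):
        for child in self.children:
            result = child.tick(obs)
            if result[0] != "FAILURE":
                return result
        return ("FAILURE", None)

npc = Selector([FleeNode(), PolicyNode(lambda obs: "attack")])
npc.tick({"health": 10})   # -> ("RUNNING", "flee")   scripted override
npc.tick({"health": 80})   # -> ("RUNNING", "attack") learned policy
```

The ordering encodes the design trade-off from the abstract directly: the BT supplies predictable outcomes where they matter, and the RL leaf supplies adaptive behavior everywhere else.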
LLM-HBT: Dynamic Behavior Tree Construction for Adaptive Coordination in Heterogeneous Robots
Wang, Chaoran, Sun, Jingyuan, Zhang, Yanhui, Zhang, Mingyu, Wu, Changju
Abstract-- We introduce a novel framework for automatic behavior tree (BT) construction in heterogeneous multi-robot systems, designed to address the challenges of adaptability and robustness in dynamic environments. Traditional robots are limited by fixed functional attributes and cannot efficiently reconfigure their strategies in response to task failures or environmental changes. To overcome this limitation, we leverage large language models (LLMs) to generate and extend BTs dynamically, combining the reasoning and generalization power of LLMs with the modularity and recovery capability of BTs. The proposed framework consists of four interconnected modules--task initialization, task assignment, BT update, and failure node detection--which operate in a closed loop. Robots tick their BTs during execution, and upon encountering a failure node, they can either extend the tree locally or invoke a centralized virtual coordinator (Alex) to reassign subtasks and synchronize BTs across peers. This design enables long-term cooperative execution in heterogeneous teams. Results show that our method consistently outperforms baseline approaches in task success rate, robustness, and scalability, demonstrating its effectiveness for multi-robot collaboration in complex scenarios.
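The closed loop of ticking, failure detection, and local tree extension can be sketched as follows. `propose_fix` stands in for the LLM (or coordinator) call, and all names are illustrative assumptions, not the paper's code.

```python
class BTNode:
    def __init__(self, name, fn):
        self.name, self.fn = name, fn
    def tick(self, world):
        return self.fn(world)

class DynamicBT:
    """Sequence that, when a node fails, asks `propose_fix` for a
    recovery node, splices it in before the failed node, and retries."""
    def __init__(self, nodes, propose_fix):
        self.nodes, self.propose_fix = nodes, propose_fix
    def run(self, world):
        i = 0
        while i < len(self.nodes):
            if self.nodes[i].tick(world) == "FAILURE":
                fix = self.propose_fix(self.nodes[i].name)  # LLM stub
                self.nodes.insert(i, fix)                   # extend locally
                continue                                    # retry from fix
            i += 1
        return "SUCCESS"

# Example: opening a locked door fails until an "unlock" node is added.
world = {"locked": True}
nodes = [BTNode("open_door",
                lambda w: "FAILURE" if w["locked"] else "SUCCESS")]
unlock = BTNode("unlock",
                lambda w: (w.update(locked=False), "SUCCESS")[1])
bt = DynamicBT(nodes, propose_fix=lambda failed_name: unlock)
bt.run(world)                         # -> "SUCCESS"
[n.name for n in bt.nodes]            # -> ["unlock", "open_door"]
```

The sketch omits what the paper handles explicitly: bounding retries, deciding between local extension and escalation to the coordinator, and synchronizing the extended tree across peers.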
LLM-Assisted Modeling of Semantic Web-Enabled Multi-Agents Systems with AJAN
Hechehouche, Hacane, Antakli, Andre, Klusch, Matthias
There are many established Semantic Web standards for implementing multi-agent applications. The AJAN framework allows engineers to build multi-agent systems based on these standards. In particular, agent knowledge is represented in RDF/RDFS and OWL, while agent behavior models are defined with Behavior Trees and SPARQL to access and manipulate this knowledge. However, appropriately defining RDF/RDFS- and SPARQL-based agent behaviors remains a major hurdle for agent modelers in practice. For example, dealing with URIs is error-prone with respect to typos, and working with complex SPARQL queries in large-scale environments involves a steep learning curve. In this paper, we present an integrated development environment that overcomes these modeling hurdles for AJAN agents and, at the same time, broadens the AJAN user community by making it possible to leverage Large Language Models for agent engineering.
RAVEN: Resilient Aerial Navigation via Open-Set Semantic Memory and Behavior Adaptation
Kim, Seungchan, Alama, Omar, Kurdydyk, Dmytro, Keller, John, Keetha, Nikhil, Wang, Wenshan, Bisk, Yonatan, Scherer, Sebastian
Aerial outdoor semantic navigation requires robots to explore large, unstructured environments to locate target objects. Recent advances in semantic navigation have demonstrated open-set object-goal navigation in indoor settings, but these methods remain limited by constrained spatial ranges and structured layouts, making them unsuitable for long-range outdoor search. While outdoor semantic navigation approaches exist, they either rely on reactive policies based on current observations, which tend to produce short-sighted behaviors, or precompute scene graphs offline for navigation, limiting adaptability to online deployment. We present RAVEN, a 3D memory-based, behavior tree framework for aerial semantic navigation in unstructured outdoor environments. It (1) uses a spatially consistent semantic voxel-ray map as persistent memory, enabling long-horizon planning and avoiding purely reactive behaviors, (2) combines short-range voxel search and long-range ray search to scale to large environments, (3) leverages a large vision-language model to suggest auxiliary cues, mitigating sparsity of outdoor targets. These components are coordinated by a behavior tree, which adaptively switches behaviors for robust operation. We evaluate RAVEN in 10 photorealistic outdoor simulation environments over 100 semantic tasks, encompassing single-object search, multi-class, multi-instance navigation and sequential task changes. Results show RAVEN outperforms baselines by 85.25% in simulation and demonstrate its real-world applicability through deployment on an aerial robot in outdoor field tests.
ORB: Operating Room Bot, Automating Operating Room Logistics through Mobile Manipulation
Qiu, Jinkai, Kim, Yungjun, Sethia, Gaurav, Agarwal, Tanmay, Ghodasara, Siddharth, Erickson, Zackory, Ichnowski, Jeffrey
Abstract-- Efficiently delivering items to an ongoing surgery in a hospital operating room can be a matter of life or death. In modern hospital settings, delivery robots have successfully transported bulk items between rooms and floors. However, automating item-level operating room logistics presents unique challenges in perception, efficiency, and maintaining sterility. We propose the Operating Room Bot (ORB), a robot framework to automate logistics tasks in hospital operating rooms (OR). ORB leverages a robust, hierarchical behavior tree (BT) architecture to integrate diverse functionalities of object recognition, scene interpretation, and GPU-accelerated motion planning. The contributions of this paper include: (1) a modular software architecture facilitating robust mobile manipulation through behavior trees; (2) a novel real-time object recognition pipeline integrating YOLOv7, Segment Anything Model 2 (SAM2), and Grounded DINO; (3) the adaptation of the cuRobo parallelized trajectory optimization framework to real-time, collision-free mobile manipulation; and (4) empirical validation demonstrating an 80% success rate in OR supply retrieval and a 96% success rate in restocking operations. These contributions establish ORB as a reliable and adaptable system for autonomous OR logistics.
Gesture-Based Robot Control Integrating Mm-wave Radar and Behavior Trees
Song, Yuqing, Tonola, Cesare, Savazzi, Stefano, Kianoush, Sanaz, Pedrocchi, Nicola, Sigg, Stephan
As robots become increasingly prevalent in both homes and industrial settings, the demand for intuitive and efficient human-machine interaction continues to rise. Gesture recognition offers an intuitive control method that does not require physical contact with devices and can be implemented using various sensing technologies. Wireless solutions are particularly flexible and minimally invasive. While camera-based vision systems are commonly used, they often raise privacy concerns and can struggle in complex or poorly lit environments. In contrast, radar sensing preserves privacy, is robust to occlusions and lighting, and provides rich spatial data such as distance, relative velocity, and angle. We present a gesture-controlled robotic arm using mm-wave radar for reliable, contactless motion recognition. Nine gestures are recognized and mapped to real-time commands with precision. Case studies are conducted to demonstrate the system practicality, performance and reliability for gesture-based robotic manipulation. Unlike prior work that treats gesture recognition and robotic control separately, our system unifies both into a real-time pipeline for seamless, contactless human-robot interaction.
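A gesture-to-command stage in such a pipeline might look like the sketch below. The gesture vocabulary, the command tuples, and the confidence threshold are invented for illustration; they are not the paper's actual set of nine gestures.

```python
# Hypothetical mapping from recognized radar gestures to arm commands.
GESTURE_COMMANDS = {
    "swipe_left":  ("move", {"dx": -0.1}),
    "swipe_right": ("move", {"dx": +0.1}),
    "push":        ("move", {"dz": -0.1}),
    "pull":        ("move", {"dz": +0.1}),
    "fist":        ("gripper", {"state": "close"}),
    "open_palm":   ("gripper", {"state": "open"}),
}

def dispatch(gesture, confidence, threshold=0.8):
    """BT-style guard condition: act only on confidently classified
    gestures; otherwise return None so the tree ticks again on the
    next radar frame instead of executing a spurious command."""
    if confidence < threshold or gesture not in GESTURE_COMMANDS:
        return None
    return GESTURE_COMMANDS[gesture]

dispatch("fist", 0.93)   # -> ("gripper", {"state": "close"})
dispatch("fist", 0.42)   # -> None (low confidence, ignored)
```

Gating on classifier confidence inside the tree is what makes the unified pipeline safe for real-time control: an uncertain radar frame degrades to a no-op rather than an unintended arm motion.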